Abstract

Background: Conventional power studies possess limited ability to assess the performance of cluster detection 
tests. In particular, they cannot evaluate the accuracy of the cluster location, which is essential in such 
assessments. Furthermore, they usually estimate power for one or a few particular alternative hypotheses and 
thus cannot assess performance over an entire region. Takahashi and Tango developed the concept of extended 
power that indicates both the rate of null hypothesis rejection and the accuracy of the cluster location. We 
propose a systematic assessment method, using here extended power, to produce a map showing the 
performance of cluster detection tests over an entire region. 

Methods: To explore the behavior of a cluster detection test on identical cluster types at any possible location,
we successively applied four different sets of spatial and epidemiological parameters. These parameter sets determined
four cluster collections, each covering the entire study region. We simulated 1,000 datasets for each cluster
and analyzed them with Kulldorff's spatial scan statistic. From the area under the extended power curve,
we constructed a map for each parameter set showing the performance of the test across the entire region.

Results: Consistent with previous studies, the performance of the spatial scan statistic increased with the 
baseline incidence of disease, the size of the at-risk population and the strength of the cluster (i.e., the relative 
risk). Performance was heterogeneous, however, even for very similar clusters (i.e., similar with respect to the 
aforementioned factors), suggesting the influence of other factors. 

Conclusions: The area under the extended power curve is a single measure of performance and, although 
needing further exploration, it is suitable to conduct a systematic spatial evaluation of performance. The 
performance map we propose enables epidemiologists to assess cluster detection tests across an entire 
study region. 
Summary

Context: Power studies have shown their limits in assessing the performance of cluster detection tests.
Because both the ability of the test to reject the null hypothesis and its ability to locate the cluster
correctly must be taken into account, conventional power cannot reflect the true performance
of these tests. Moreover, these assessments generally address only a limited number of alternative hypotheses and
therefore ignore the behavior of these tests over an entire study region. Takahashi and Tango proposed the
concept of extended power which, beyond conventional power, also reflects the accuracy of the cluster
location. We propose a systematic assessment method, based here on extended power, to
produce a map offering a synoptic view of the performance of cluster detection tests over
an entire region.

Methods: To explore the behavior of a cluster detection test on the same type of
cluster at every possible location, we fixed four sets of spatial and
epidemiological parameters in order to simulate four cluster collections, each covering the entire study
region. One thousand datasets were simulated for each cluster and submitted to Kulldorff's spatial scan.
From the area under the extended power curve, we produced a performance map for
each parameter set.

Results: Consistent with previous studies, the performance of the spatial scan increases with the baseline
incidence of the disease, the size of the at-risk population and the strength of the cluster (i.e., the relative risk). However,
even for very similar clusters, the performance of the test is heterogeneous, suggesting the potential
influence of other factors.

Conclusions: The area under the extended power curve is a single measure of performance and, although
it requires further evaluation, it is suitable for the systematic spatial assessment of
performance. The performance map we propose allows epidemiologists to assess cluster detection
tests over an entire study region.



Background 

Spatial clusters can be detected using a wide range of
statistical tests [1,2], many of which are available in free
software packages such as R [3,4]. Epidemiologists use
local methods to detect clusters without a priori
knowledge of their location, and to determine their
significance. Because these cluster detection tests (CDTs)
must reveal both the presence and location of clusters,
performance studies have been constrained by the
limitations of conventional estimation techniques. For
example, a CDT may have maximum power for rejecting
the null hypothesis (cluster absence), yet be incapable of
accurately locating the simulated cluster. CDT performance
is also a function of epidemiological and geographical
context [1,5-11]. Furthermore, because epidemiological (e.g.,
incidence and relative risk) and geographical (e.g., spatial
unit size and shape) factors tend to be intrinsically
linked, their individual and joint effects are difficult to
evaluate. When evaluating the behavior of these CDTs
in a particular region, only limited knowledge can
consequently be gleaned by simulating one or a few clusters
in that region, and even less knowledge can be accrued
from studies on other regions.

Takahashi and Tango have proposed the concept of
extended power (EP) [12,13] as a more accurate measure
of CDT performance. This measure assesses both the
probability that the null hypothesis is rejected and the 
accuracy of the cluster location. As such, it overcomes 
the inadequacy of conventional power measures. How- 
ever, EP cannot eliminate the need to define what is 
meant by "an accurate" or "sufficiently accurate" loca- 
tion. The level of spatial accuracy depends upon context; 
for instance, an epidemiologist will require higher spatial 
accuracy for an ad hoc study than for a survey system. 
Takahashi and Tango therefore introduced a quantitative 
indicator of spatial accuracy, and summarized CDT per- 
formance using an EP curve in conjunction with this 
spatial accuracy indicator. 

In this work, we propose a method that integrates the 
area under the EP curve (AUCep) in order to produce 
maps that provide a global overview of CDT perform- 
ance over an entire study region. 

Methods 

Clustering model 

To explore CDT behavior on same-class clusters in all
possible locations, we set common spatial and epidemiological
characteristics for four cluster collections covering
the entire study region. The study region was the
Auvergne region (France), divided into n = 221 spatial
units (SUs) equivalent to U.S. ZIP codes. The exhaustive
collection of approximately circular clusters with four
SUs was identified within the study region. To achieve
this outcome, each of the 221 SUs was successively associated
with its three nearest neighbors as defined by
Euclidean distances between the SU centroids. To obtain
four cluster collections, we applied four combinations of
two baseline risks (incidences) and two relative risks to
the same at-risk population, whose size was estimated by
the mean annual number of live births.
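
The construction of the cluster collection described above can be sketched as follows. This is only an illustration in Python (the study itself was run in R); the function name, the SU identifiers and the planar coordinates are hypothetical and are not taken from the authors' script.

```python
import math

def build_cluster_collection(centroids, k=3):
    """Return one cluster per SU: the SU itself plus its k nearest neighbours."""
    clusters = {}
    for su, (x, y) in centroids.items():
        # Euclidean distance from this SU's centroid to every other centroid
        dists = sorted(
            (math.hypot(x - ox, y - oy), other)
            for other, (ox, oy) in centroids.items()
            if other != su
        )
        # Central SU first, then its k closest neighbours
        clusters[su] = [su] + [other for _, other in dists[:k]]
    return clusters

# Toy example with hypothetical SU identifiers and planar coordinates
toy_centroids = {
    "A": (0.0, 0.0), "B": (1.0, 0.0), "C": (0.0, 1.0),
    "D": (1.0, 1.0), "E": (5.0, 5.0), "F": (6.0, 5.0),
}
toy_clusters = build_cluster_collection(toy_centroids, k=3)
print(toy_clusters["A"])
```

With k = 3, every SU becomes the center of exactly one four-SU cluster, so a region of 221 SUs yields 221 clusters, many of which share SUs with their neighbors.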

For a realistic analysis, we used data archived in CEMC 
(birth defects registry for the Auvergne region) and INSEE 
(National Institute of Statistics and Economic Studies) data- 
bases. We collected two categories of data from 1999 to 
2006: all birth defects and cardiovascular birth defects. Both 
datasets were sorted by SU. The number of live births was 
approximated by the number of birth declarations in the 
at-risk population. Global annual incidences of all birth
defects ($I_{all}$) and cardiovascular birth defects ($I_{cv}$) were
estimated as 2.26% and 0.48% of births, respectively. In the
analysis, we constructed risk combinations of these two
incidences at relative risks of 3 and 6.

Datasets 

For each cluster within the four categories (221 × 4), we
generated 1,000 datasets, i.e., a total of 884,000 datasets.
Each dataset consisted of 221 rows (one per SU) and 5 columns.
The columns contained the SU coordinates (longitude and latitude),
the observed number of cases, the size of the at-risk population
(i.e., the number of live births) and the expected number of
cases in the specified SU. This last quantity was the
product of the global incidence ($I_{all}$ or $I_{cv}$) and the
at-risk population size in the SU. The observed case numbers
were assumed to be independent Poisson variables
such that

$$
\begin{cases}
H_0: E(N_i) = \varepsilon_i, & N_i \sim \mathrm{Pois}(\varepsilon_i), \quad i = 1, \dots, n \\
H_1: E(N_i) = \pi_i, & N_i \sim \mathrm{Pois}(\pi_i), \quad \pi_i = \theta I_i \varepsilon_i + \varepsilon_i (1 - I_i), \quad i = 1, \dots, n
\end{cases}
$$

where $N_i$ is the observed number of cases, $\varepsilon_i$ denotes the
expected number of cases in the ith SU under the null
hypothesis of risk homogeneity ($H_0$) and $\pi_i$ the expected
number of cases in the ith SU under the alternative
hypothesis of one simulated cluster ($H_1$). $\theta$ is the relative
risk, and $I_i$ is a binary indicator set to 1 if the ith SU is
within the simulated cluster, and 0 otherwise.
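
As an illustration of this data-generating model, the sketch below draws one dataset under the alternative hypothesis. It is a minimal Python example under stated assumptions: the function names and the toy birth counts are hypothetical, the coordinate columns are omitted, and the Poisson sampler is Knuth's simple multiplication method, which is adequate only for the small expected counts considered here.

```python
import math
import random

def poisson_draw(mean, rng):
    """Draw one Poisson variate with Knuth's multiplication method (small means only)."""
    if mean <= 0:
        return 0
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def simulate_dataset(births, incidence, cluster, theta, rng):
    """Simulate observed counts for every SU under H1 (theta = 1 reproduces H0)."""
    rows = []
    for su, n_births in births.items():
        expected = incidence * n_births               # epsilon_i
        in_cluster = 1 if su in cluster else 0         # indicator I_i
        # pi_i = theta * I_i * epsilon_i + epsilon_i * (1 - I_i)
        mean = theta * in_cluster * expected + expected * (1 - in_cluster)
        rows.append({"su": su, "observed": poisson_draw(mean, rng),
                     "births": n_births, "expected": expected})
    return rows

# Hypothetical at-risk populations; incidence and relative risk as in the text
rng = random.Random(2013)
toy_births = {"A": 100, "B": 200, "C": 50}
toy_dataset = simulate_dataset(toy_births, 0.0226, {"A"}, 6.0, rng)
for row in toy_dataset:
    print(row)
```

Repeating the call 1,000 times for a given cluster gives the kind of dataset collection that is then submitted to the test.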

Measure of performance 

The extended power was proposed by Takahashi and Tango
as an improved measure of CDT performance. For a particular
cluster, global performance is the weighted cumulative
sum of the contribution of each detected cluster in all
submitted datasets. Here, we summarize the construction
of the performance indicator. For a more detailed description,
the reader is referred to Takahashi and Tango [12,13].

Within a simulated cluster of s SUs, if the null hypothesis
is rejected, the size l of the detected cluster and the
number $s^*$ of its SUs that lie within the simulated cluster
(i.e., $s^*$ counts a subset of the s simulated SUs) are recorded. A
maximum cluster size L is imposed, such that if l > L,
the detected cluster is discarded. This limit prevents very
large, meaningless clusters from contributing to CDT
global performance. In this work, L was set to 30 SUs.

All eligible detected clusters (EDCs), i.e., those with $l \le L$, are
counted and sorted by l and $s^*$. For each combined value
of l and $s^*$, the proportion of corresponding detected
clusters, $P(l, s^*)$, in all submitted datasets is assigned a
weight $W(l, s^*)$. This weight is also a function of the
detection accuracy (i.e., the correct location of the
simulated cluster). Thus, Takahashi and Tango define
$W(l, s^*, w^+, w^-)$ as

$$
W(l, s^*, w^+, w^-) = \sqrt{\left[1 - \min\{w^-(s - s^*), 1\}\right]\left[1 - \min\{w^+(l - s^*), 1\}\right]}
$$

where $w^-$ and $w^+$ are penalties for false negative and
false positive SUs, respectively. The penalties $w^-$ and $w^+$
are determined according to the following constraints.
For $w^-$, detected clusters that generate no false negative
must fully contribute to global performance, and those
that induce s false negatives must be discarded. These
constraints are satisfied when

$$
w^- = 1/s
$$

For $w^+$, detected clusters that generate no false positive
must fully contribute to global performance, and those
that induce at least $l_0$ false positives must be discarded.
These constraints are satisfied when

$$
w^+ = 1/l_0
$$

So that $l_0$ is not assigned arbitrarily, Takahashi and
Tango specify the ratio

$$
q = w^+ / w^-
$$

To favor sensitivity over specificity (as is usually preferred),
$w^-$ is greater than or equal to $w^+$; thus $l_0 \ge s$
because $1/s \ge 1/l_0$. For example, when:

• $l_0 = s$, $w^- = w^+$ and q = 1;

• $l_0 = 2s$, $w^- = 2w^+$ and q = 0.5;

• $l_0 \to \infty$, $w^+ = 0$ and q = 0.
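
The weighting scheme can be made concrete with a short sketch. It assumes the penalties are expressed through q, i.e. $w^- = 1/s$ and $w^+ = q/s$ (which follows from $q = w^+/w^-$); the function name and argument order are illustrative choices, not part of the original work.

```python
import math

def weight(l, s_star, s, q):
    """Weight W(l, s*, q) of a detected cluster, with w- = 1/s and w+ = q/s."""
    false_negatives = s - s_star      # simulated SUs missed by the detected cluster
    false_positives = l - s_star      # detected SUs outside the simulated cluster
    fn_term = 1.0 - min(false_negatives / s, 1.0)
    fp_term = 1.0 - min(q * false_positives / s, 1.0)
    return math.sqrt(fn_term * fp_term)

# Simulated cluster of s = 4 SUs, as in this study
print(weight(4, 4, 4, 1.0))   # exact detection
print(weight(6, 4, 4, 1.0))   # two false positives, strictest penalty (q = 1)
print(weight(8, 4, 4, 1.0))   # four false positives reach l0 = s when q = 1
print(weight(8, 4, 4, 0.5))   # same cluster tolerated when l0 = 2s
print(weight(30, 4, 4, 0.0))  # q = 0: false positives are not penalized
```

The three bullet cases correspond to calling `weight` with q = 1, q = 0.5 and q = 0, respectively.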

For each value of q, the extended power is the cumulative
sum of $W(l, s^*, q) \times P(l, s^*)$, where l runs from 1 to L and
$s^*$ runs from 0 to s. CDT global performance in detecting
a particular cluster is then represented by the extended
power curve with q running from 0 to 1. At any point
on this curve, the extended power is, by construction,
between 0 and 1. Furthermore, we note that the extended
power is a monotonically non-increasing function of
q. Consequently, the area under the extended power
curve (AUCep), defined by

$$
\mathrm{AUC}_{EP} = \int_0^1 \sum_{l=1}^{L} \sum_{s^*=0}^{s} W(l, s^*, q) \, P(l, s^*) \, dq
$$

is between 0 and 1, with 0 signifying an inoperative CDT
($s^*$ always null) and 1 a perfect CDT ($H_0$ always rejected,
with all detected clusters exactly overlaying the simulated
cluster). As suggested by Takahashi and Tango
[13], we used the area under the extended power curve
as the measure of CDT performance.
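
A minimal numerical sketch of these two quantities is given below. It assumes the proportions $P(l, s^*)$ have already been tabulated over all submitted datasets and are stored in a dictionary keyed by (l, s*); the names, the toy proportions and the use of the trapezoidal rule are illustrative assumptions rather than the authors' implementation.

```python
import math

def weight(l, s_star, s, q):
    """Weight W(l, s*, q), with w- = 1/s and w+ = q/s."""
    fn_term = 1.0 - min((s - s_star) / s, 1.0)
    fp_term = 1.0 - min(q * (l - s_star) / s, 1.0)
    return math.sqrt(fn_term * fp_term)

def extended_power(proportions, s, q, max_size=30):
    """Sum of W(l, s*, q) * P(l, s*) over eligible detected clusters (l <= L)."""
    return sum(
        p * weight(l, s_star, s, q)
        for (l, s_star), p in proportions.items()
        if l <= max_size
    )

def auc_ep(proportions, s, steps=1000, max_size=30):
    """Trapezoidal approximation of the area under the EP curve for q in [0, 1]."""
    qs = [i / steps for i in range(steps + 1)]
    eps = [extended_power(proportions, s, q, max_size) for q in qs]
    return sum((eps[i] + eps[i + 1]) / 2 for i in range(steps)) / steps

# Hypothetical outcome for a 4-SU cluster: 60% exact detections,
# 30% detections with two false positives, 10% without rejection of H0
toy_proportions = {(4, 4): 0.6, (6, 4): 0.3}
print(extended_power(toy_proportions, 4, 0.0))  # intercept of the EP curve
print(auc_ep(toy_proportions, 4))
```

In this toy case the curve starts at 0.9 for q = 0 and declines as q increases, because the clusters with false positives contribute less and less.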

Performance mapping 

Global performance was visualized over the entire region 
using maps representing the measured AUCep for each 
collection of clusters. 

The AUCep is a measure of a cluster and is thus associated
with four SUs. In order to obtain a global overview
on a single map, we assigned the AUCep value of each
cluster to its central SU. Thus, a single
measure of AUCep was attributed to each SU of the map. As we
defined four cluster collections for four risk combinations
(incidence and relative risk), we produced four
performance maps.

Kulldorff's spatial scan statistic

In this study, we selected Kulldorff's spatial scan
statistic [14,15], a well-known and widely used CDT
whose performance has been studied by many authors
[1,6,10,16]. The spatial scan statistic detects the most
likely cluster based on locally observed statistics of
likelihood ratio tests. The scan statistic considers all
possible zones z defined by two parameters: a center
that is successively placed on the centroid of each SU,
and a radius varying between 0 and a predefined maximum.
Because the true geography is delineated by administrative
tracts, each zone z, defined by all SUs
whose centroids lie within the circle, is irregularly
shaped. Let $N_z$ and $n_z$ be the size of the at-risk
population and the number of cases counted in zone z
(over the entire region, these quantities are the total
population size N and the total number of cases n,
respectively). The probabilities of being a case
inside and outside zone z are respectively defined by
$p_z = n_z/N_z$ and $q_z = (n - n_z)/(N - N_z)$. Given the null
hypothesis $H_0$: $p_z = q_z$ versus the alternative $H_1$: $p_z > q_z$
and assuming a Poisson distribution of cases, Kulldorff
defined the likelihood ratio statistic as proportional to

$$
\left(\frac{n_z}{\lambda N_z}\right)^{n_z} \left(\frac{n - n_z}{\lambda (N - N_z)}\right)^{n - n_z} I(n_z > \lambda N_z)
$$

where $\lambda$ is the global incidence, and the indicator function
I equals 1 when the number of observed cases in zone
z exceeds the expected number under $H_0$, and 0 otherwise.
The circle yielding the highest likelihood ratio is
identified as the most likely cluster. The p-value is
obtained by Monte Carlo inference.
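
The search for the most likely cluster can be illustrated with the following sketch, which works on the logarithmic scale. It is not the implementation used in the study (the analysis was run in R); the function names and toy data are hypothetical, the candidate zones are supplied explicitly rather than generated from circles, and zones without excess risk are given a log-ratio of 0 by convention so that they can never be selected.

```python
import math

def log_likelihood_ratio(n_z, N_z, n, N):
    """Log of the Poisson likelihood ratio for one zone (0.0 if no excess risk)."""
    lam = n / N                          # global incidence (lambda)
    expected_in = lam * N_z              # expected cases inside the zone under H0
    expected_out = lam * (N - N_z)       # expected cases outside the zone under H0
    if n_z <= expected_in:               # indicator function equals 0
        return 0.0
    cases_out = n - n_z
    inside = n_z * math.log(n_z / expected_in)
    outside = cases_out * math.log(cases_out / expected_out) if cases_out > 0 else 0.0
    return inside + outside

def most_likely_cluster(zones, cases, population):
    """Return the candidate zone with the highest log-likelihood ratio."""
    n, N = sum(cases.values()), sum(population.values())
    best_zone, best_llr = None, 0.0
    for zone in zones:
        n_z = sum(cases[su] for su in zone)
        N_z = sum(population[su] for su in zone)
        llr = log_likelihood_ratio(n_z, N_z, n, N)
        if llr > best_llr:
            best_zone, best_llr = zone, llr
    return best_zone, best_llr

# Hypothetical counts for three SUs and three candidate zones
toy_cases = {"A": 10, "B": 1, "C": 9}
toy_population = {"A": 100, "B": 100, "C": 800}
toy_zones = [["A"], ["A", "B"], ["C"]]
print(most_likely_cluster(toy_zones, toy_cases, toy_population))
```

A Monte Carlo p-value would then be obtained by recomputing the maximum statistic on datasets simulated under the null hypothesis and ranking the observed maximum among them.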

Software 

Data simulation and analysis (see Data and Script in the 
Additional files 1 and 2) were performed in R 2.14.0 
[3,17-19] using AUVERGRID [20]. 

Results 

The Auvergne region is characterized by low and 
medium mountains situated around a central plain. 
The at-risk population (see Methods) was heteroge- 
neously distributed throughout sparsely populated areas 
(mainly borderland and mountainous) and highly po- 
pulated urban areas. Figure 1 shows the size of the 
at-risk population in each cluster, which was assigned 
to its central SU. 

Figure 2 demonstrates how CDT performance improved
with increasing risk level. Clearly, the CDT
could not detect clusters within regions with low numbers
of births. For these clusters, performance only marginally
improved, even at the highest risk combination
(Figure 3).

CDT performance increased monotonically with the 
at-risk population size (Figure 3). We noted a stronger 
heterogeneity of CDT performance for the clusters with 
the largest populations, especially at intermediate risk 
levels (Figure 3); by this, we mean that clusters with 
nearly the same population size led to slightly different 
test performance behaviors. For example, Figure 4 shows
test performance in detecting three clusters centered on 
SUs "43770" (red cluster in the figure), "03700" (blue 
cluster) and "03420" (green cluster), which had popula- 
tion sizes of 544, 558 and 545 births (mean number over 
8 years), respectively. At the lowest risk level, the red 
cluster was the only one even marginally detected, 
whereas under other configurations, the blue cluster was 
best detected. The worst detection performance was ex- 
hibited with respect to the green cluster, particularly at 
intermediate risk levels. We note that the green cluster 
was the only borderland cluster. 

Some summary statistics of the AUCep distributions
are displayed in Table 1. Figure 5 shows two
different extended power curves (and thus two different
CDT behaviors) that have nearly equal AUCep.
One of these clusters was centered on SU "03160",
the other on SU "63112".



Guttmann et al. International Journal of Health Geographies 2013, 12:47 
http://www.ij-healthgeographics.conn/content/1 2/1/47 







Generation of one performance map from 221,000 
datasets required about 5 days of computational time 
using the AUVERGRID grid. 

Discussion 

Takahashi and Tango [13] have suggested using the 
AUCep to compare performance between CDTs. We 
used this synthetic indicator, suitable for compiling 
maps, to describe CDT performance. It thus fulfills our 
primary goal of realizing a systematic performance as- 
sessment of a CDT over an entire study area, rather than 
over only a few clusters. This mapping method, although 
using Takahashi and Tango's extended power, is not
dependent on this concept. Our method can use any 
other indicator that meets the requirements of being a 
scalar (i.e., a single measure of performance) indicating 
both the spatial accuracy of the detection and the 
capacity of cluster detection tests to reject the null 
hypothesis. 



Interpretation of the AUCep requires further explor- 
ation, however. Although a higher AUCep clearly signi- 
fies stronger CDT performance, quite different behaviors 
can yield the same AUCep. As shown in Figure 5, different
curves can possess very similar AUCep values. This
figure shows the extended power curves "03160" and 
"63112", whose AUCep values are nearly equal (0.931 
and 0.932, respectively), but which reflect different CDT 
behaviors. The procedures used to construct these 
curves are described in detail within separate spread- 
sheets (see EP curve in the Additional file 3). 

The curve "63112" is nearly horizontal, indicating that
the EDCs ($H_0$ rejected, and cluster size l ≤ maximum
cluster size L) located the simulated cluster with high
accuracy. As q increases, less tolerance is given to false
positives until, eventually, only EDCs with at least one
true positive and fewer than s false positives can contribute
to the extended power. A near zero slope thus
indicates that the same detected clusters, all of which
contain fewer than s false positives, contribute to the
extended power, regardless of q.

Figure 2 AUCep of Kulldorff's spatial scan. AUCep was measured for four combinations of two relative risks (RR) and two annual incidences of
birth defects: low RR = 3 and high RR = 6; low incidence = 0.48% births and high incidence = 2.26% births.

The intercept of curve "63112" is 0.939, meaning that
eligible clusters (l ≤ L), all of which contribute to the
extended power (i.e., all clusters contain at least one
true positive), were detected in 93.9% of the tests ($H_0$
rejected).

To summarize curve "63112", the simulated cluster
was not always detected (no $H_0$ rejection or EDC without
true positive); however, provided that an EDC identified
at least one true positive, the location was accurate
(i.e., fewer than s false positives existed in the cluster).

In contrast, the curve "03160" yields the same AUCep
but is negatively sloped with an intercept of 0.951. Thus,
the associated CDT produced more EDCs containing at
least one true positive. The negative slope indicates that
a higher proportion of these EDCs generated at least s
false positives.

To summarize curve "03160", the test rejected $H_0$
more often and/or produced more EDCs, but located
the simulated cluster with less accuracy (i.e., this analysis
produced more than s false positives).

One particular curve has an intercept equal to 1 (q = 0)
and a zero slope. An intercept equal to 1 implies that
the CDT always rejects $H_0$ and that no false negatives
exist in the EDCs. All detected clusters entirely overlap
the simulated cluster, as in all other cases the weighting
function $W(l, s^*, q = 0)$ is less than one. In addition, the zero
slope indicates the perfect test that always exactly locates
the simulated cluster. A perfect test always rejects $H_0$,
and detected clusters always satisfy $l = s^* = s$ (i.e., generate
no false positive or negative). The AUCep of a perfect
test equals one, because in all other cases $W(l, s^*, q)$ is less
than one.

Figure 3 AUCep of Kulldorff's spatial scan based on the size of the at-risk population for four combinations of two relative risks (RR)
and two annual incidences of birth defects: low RR = 3 and high RR = 6; low incidence = 0.48% births and high incidence = 2.26% births.

The intercept of an extended power curve can be 
regarded as a "quantitative" feature of CDT perform- 
ance (all EDCs generating true positives contribute 
to the extended power), whereas the slope may be 
thought of as a "qualitative" feature of CDT perform- 
ance, assessing location accuracy. The parameter q 
can, in fact, be regarded as a continuous indicator 
reflecting to what extent a detected cluster must accur- 
ately locate the simulated cluster to contribute to the 
performance measure. 



As shown in Figure 5, however, if an entire curve is 
condensed into a single measure (such as the AUC),
some information is lost, because CDTs with different 
behaviors (i.e., curves with different shapes) can yield 
the same performance value. 

Consequently, the impact of CDT behavior on the ex- 
tended power curve must be thoroughly explored, and 
behaviors relevant to a particular research or application 
need to be defined. Through such exploration, the extent 
to which the AUCep is a relevant performance measure, 
and the purposes for which it is most suited, can be 
determined. 

Figure 4 AUCep of Kulldorff's spatial scan and locations of three simulated clusters for four combinations of two relative risks (RR) and
two annual incidences of birth defects: low RR = 3 and high RR = 6; low incidence = 0.48% births and high incidence = 2.26% births.

Table 1 AUCep distribution for each risk combination and category of at-risk population size

Risk combination        Number of births (a)   AUCep mean (SD)   AUCep min - max
$I_{cv}$ and RR = 3     < 102                  0.010 (0.003)     0.003 - 0.020
                        [102, 175]             0.021 (0.006)     0.007 - 0.033
                        [175, 293]             0.043 (0.013)     0.023 - 0.077
                        > 293                  0.133 (0.089)     0.055 - 0.542
$I_{all}$ and RR = 3    < 102                  0.070 (0.028)     0.019 - 0.138
                        [102, 175]             0.183 (0.038)     0.119 - 0.268
                        [175, 293]             0.382 (0.075)     0.246 - 0.543
                        > 293                  0.713 (0.117)     0.492 - 0.950
$I_{cv}$ and RR = 6     < 102                  0.061 (0.025)     0.016 - 0.110
                        [102, 175]             0.185 (0.047)     0.114 - 0.297
                        [175, 293]             0.412 (0.083)     0.277 - 0.553
                        > 293                  0.768 (0.113)     0.524 - 0.971
$I_{all}$ and RR = 6    < 102                  0.511 (0.162)     0.168 - 0.787
                        [102, 175]             0.874 (0.050)     0.783 - 0.959
                        [175, 293]             0.970 (0.019)     0.915 - 0.995
                        > 293                  0.990 (0.010)     0.964 - 1

(a) Mean number between 1999 and 2006.



The EP has the advantage of requiring only one arbitrarily
set parameter. In this work, the parameter L, which
determines the maximum allowed size for EDCs, was
set to 30 SUs. Takahashi and Tango [12] initially
proposed setting the limit L to one fourth or one third of
the region size (in number of SUs). The authors stated that
it was not unreasonable to assume that an actual cluster
size will be less than such a limit. Such arguments are
often open to dispute, and in any case the decision is
arbitrary. In our view, it would be more correct to set L
according to the size s of the simulated cluster because,
in the simulation, it is the "real" cluster. By construction,
the consequences of this arbitrary setting are limited to
the lowest values of q. Indeed, low values of q mean that
EDCs with false positives are less penalized, and thus
large clusters are allowed to contribute to EP. In our
case (L = 30), only values of extended power for q < 0.15
could be underestimated, and only if we consider that
detected clusters more than 7.5 times larger than the
simulated cluster (4 SUs) are still meaningful. Finally,
compared with L set to 30, computing AUCep with L
equal to 221 (i.e. without an arbitrary limit) yields a dif- 
ference in AUCep always less than 10'^ in this work. 
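
The threshold on q quoted in this paragraph can be recovered from the weight function, assuming $w^+ = q/s$ (which follows from $q = w^+/w^-$ and $w^- = 1/s$). A detected cluster excluded by the limit has $l \ge L + 1$ and at most $s^* = s$ true positives, hence at least $L + 1 - s$ false positives. It would have received a nonzero weight only if its false positive penalty stayed below 1:

$$
w^+ (l - s^*) < 1
\iff q < \frac{s}{l - s^*} \le \frac{s}{L + 1 - s} = \frac{4}{31 - 4} = \frac{4}{27} \approx 0.148
$$

For q at or above this value, every cluster larger than L = 30 SUs already has a weight of zero, so the limit L cannot change the extended power. The factor of 7.5 is simply the ratio L/s = 30/4.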

Figure 5 Extended power curves for two simulated clusters. Line 03160: cluster centered on the SU with zip code 03160 (northwest
Auvergne); line 63112: cluster centered on the SU with zip code 63112 (central Auvergne). Both clusters were simulated with a relative risk of 6
and a baseline incidence of birth defects set to 2.26%.

In producing our performance map, we chose to assign
the AUCep value of a single cluster of four SUs to a
single SU. Because two clusters centered on neighboring
SUs likely contain common SUs, and the AUCep evaluates
the detection of the entire cluster, visualizing
performance on a single map can only be done in two
ways. On the one hand, the AUCep of a cluster can be
assigned to each of its SUs, or on the other hand, it can
be assigned to a single, albeit arbitrarily chosen, SU. In
the first solution, as each SU has a strong probability of
being associated with more than one cluster, it is then
necessary to compute a summary statistic, such as the
mean, to produce a single map. In our view, it seems
more comprehensible to arbitrarily assign the performance
measure for the whole cluster to a single SU. As
we simulated more or less circular clusters, the central
SU of the cluster was naturally chosen for this assignment.
When simulating different cluster shapes, this
choice will clearly be less obvious. We nevertheless
recommend assigning the performance measure to the SU
where the centroid of the cluster is located.
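
The two mapping options discussed here can be contrasted with a small sketch. The data structures are assumptions made for illustration: `clusters` maps each central SU to the list of SUs in its cluster, and `auc` maps each central SU to the AUCep of that cluster; neither name comes from the original script.

```python
from collections import defaultdict

def map_by_central_su(clusters, auc):
    """Option retained in this work: each SU takes the AUCep of the cluster centred on it."""
    return {central: auc[central] for central in clusters}

def map_by_member_mean(clusters, auc):
    """Alternative: each SU takes the mean AUCep of all clusters that contain it."""
    values = defaultdict(list)
    for central, members in clusters.items():
        for su in members:
            values[su].append(auc[central])
    return {su: sum(v) / len(v) for su, v in values.items()}

# Hypothetical two-SU clusters and AUCep values
toy_map_clusters = {"A": ["A", "B"], "B": ["B", "A"], "C": ["C", "B"]}
toy_auc = {"A": 0.2, "B": 0.4, "C": 0.9}
print(map_by_central_su(toy_map_clusters, toy_auc))
print(map_by_member_mean(toy_map_clusters, toy_auc))
```

The second option smooths the map, since an SU shared by several clusters receives an average, whereas the first keeps one unaltered cluster-level value per SU.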

Authors who have studied CDT behavior mentioned
its dependence on epidemiological and geographical
factors [1,5-11]. Consistent with previously published
results, the performance of Kulldorff's spatial scan, and
more generally, all local CDTs, improves in study
regions of small SUs, large populations, high incidence of
the studied phenomenon and for clusters with strong
relative risk. Furthermore, as shown in Figure 4 and
Table 1, the variation in AUCep among very similar sim- 
ulated clusters (identical length, shape, population size 
and risk association) suggests that other factors influ- 
ence CDT performance. To our knowledge, no other 
simulation study has been performed to both assess and 
visualize CDT performance over an entire region. Until 
now, authors have always considered a limited set of 
simulated clusters with particular epidemiological or 
geographical characteristics of interest. Consider the 
typical example of population size effect. To assess this 
effect, clusters are generally simulated in only a few arbi- 
trarily chosen locations where a CDT behavior is as- 
sumed to be representative of its behavior in any other 
"similar" location. Usually, clusters in rural areas are 
compared with clusters in urban areas. Such studies are 
not sufficient to assess this factor that, as we have shown 
(Figure 3), has a strong relationship with CDT perform- 
ance. Furthermore, population size cannot explain in it- 
self all the variability in CDT performance. 

However, some authors [21] have assessed performance 
on many randomly located clusters, which is a way to take 
into account the effect of spatial location without assessing 
it. It enabled them to assess the effect of factors such as 
relative risk or spatial resolution without the potential con- 
founding effect of the spatial location. Still, this approach, 
while accounting for this effect, cannot quantify it.

Our systematic evaluation allows us to assess exactly
where heterogeneity is greatest, and thus within
what population size range we can expect any other potential
factor to have a maximum effect. In this work, we
used predefined values for incidence and clustering
characteristics (relative risk, shape, size and number) to
generate performance maps. Epidemiologists should use
reasonable values if a priori knowledge is available for
some factors. However, the specific effect of any factor
on CDT performance can be studied with this systematic
evaluation, provided it uses a suitable measure such as the
AUCep.



Conclusion 

Given that CDT performance depends on geographical 
and epidemiological context, the performance of these 
methods should be explored prior to monitoring a par- 
ticular phenomenon in a given region. This work enables 
epidemiologists to study global CDT performance over 
an entire region. Furthermore, from a research viewpoint,
our method seems beneficial for unraveling the
specific effect of many factors, particularly geographical
ones, on CDT performance.

Additional files 



Additional file 1: Script: This file is an R script (script.r) containing a
complete procedure to define the collection of clusters, simulate 
the datasets, perform the test and plot the corresponding 
performance map. 

Additional file 2: Data: This is a zip file (Data.zip) containing the
population data in an R format (Pop.rda) and a folder with the
shapefiles for the Auvergne region.

Additional file 3: EP curve: This file is an Excel spreadsheet
(EP curve.xls) containing two worksheets. Sheets "03160" and "63112"
describe step-by-step construction of EP curves for clusters centered on
SU "03160" and SU "63112", respectively. In both constructions, the
relative risk is set to 6 and the baseline incidence of birth defects is 
assumed to be 2.26%. To toggle between the corresponding procedures 
for calculating EP, the user need only alter the value of in cell D41. 

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

AG and LO conceived the design, performed the study and drafted the 
manuscript. AG was responsible for statistical programming and data 
analysis. JD, JG, IP, XL and JYB contributed to manuscript revision. All authors 
read and approved the final manuscript. 

Acknowledgments 

The authors are very grateful to Dr. Francannet who granted access to the 
CEMC database. We thank Paul De Vlieger who provided access and 
technical support for AuverGrid on behalf of the particle physics laboratory, 
Blaise Pascal University. 






Author details 

Department of Biostatistics, Medical Informatics and Communication Technologies, Clermont University Hospital, Clermont-Ferrand F-63000, France.
ISIT, UMR CNRS UDA 6284, Auvergne University, Clermont-Ferrand F-63001, France.
PEPRADE, EA 4681, Clermont-Ferrand F-63000, France.
SESSTIM, UMR 912 INSERM IRD AMU, Aix-Marseille University, Marseille F-13005, France.
Biostatistics Unit, Assistance Publique Hôpitaux de Marseille, Marseille F-13005, France.
AGIM, FRE CNRS 3405, J. Fourier University, La Tronche University School of Medicine, Grenoble F-38700, France.



Received: 31 July 2013 Accepted: 15 October 2013 
Published: 25 October 2013 



References

1. Kulldorff M, Tango T, Park PJ: Power comparisons for disease clustering tests. Comput Stat Data Anal 2003, 42:665-684.
2. Sankoh OA, Becher H: Disease cluster methods in epidemiology and application to data on childhood mortality in rural Burkina Faso. Inform Biom Epidemiol Med Biol 2002, 33:460-472.
3. Gomez-Rubio V, Ferrandiz J, Lopez A: Detecting clusters of diseases with R. Proc DSC 2003:2.
4. Robertson C, Nelson TA: Review of software for space-time disease surveillance. Int J Heal Geogr 2010, 9:16.
5. Aamodt G, Samuelsen SO, Skrondal A: A simulation study of three methods for detecting disease clusters. Int J Heal Geogr 2006, 5:15.
6. Ozonoff A, Jeffery C, Manjourides J, White LF, Pagano M: Effect of spatial resolution on cluster detection: a simulation study. Int J Heal Geogr 2007, 6:52.
7. Jeffery C, Ozonoff A, White LF, Nuno M, Pagano M: Power to detect spatial disturbances under different levels of geographic aggregation. J Am Med Informatics Assoc JAMIA 2009, 16:847-854.
8. Olson KL, Grannis SJ, Mandl KD: Privacy protection versus cluster detection in spatial epidemiology. Am J Public Heal 2006, 96:2002-2008.
9. Puett R, Lawson A, Clark A, Aldrich T, Porter D, Feigley C, Hebert J: Scale and shape issues in focused cluster power for count data. Int J Heal Geogr 2005, 4:8.
10. Goujon-Bellec S, Demoury C, Guyot-Goubin A, Hemon D, Clavel J: Detection of clusters of a rare disease over a large territory: performance of cluster detection methods. Int J Heal Geogr 2011, 10:53.
11. Jacquez GM: Cluster morphology analysis. Spat Spatio-Temporal Epidemiol 2009, 1:19-29.
12. Tango T, Takahashi K: A flexibly shaped spatial scan statistic for detecting clusters. Int J Heal Geogr 2005, 4:11.
13. Takahashi K, Tango T: An extended power of cluster detection tests. Stat Med 2006, 25:841-852.
14. Kulldorff M: A spatial scan statistic. Commun Stat Theor M 1997, 26:1481-1496.
15. Kulldorff M, Nagarwalla N: Spatial disease clusters: detection and inference. Stat Med 1995, 14:799-810.
16. Ribeiro SHR, Costa MA: Optimal selection of the spatial scan parameters for cluster detection: a simulation study. Spat Spatio-Temporal Epidemiol 2012, 3:107-120.
17. Cici C, Kim AY, Ross M, Wakefield J, Venkatraman ES: SpatialEpi: Performs various spatial epidemiological analyses. R package version 1.7; 2013. http://CRAN.R-project.org/package=SpatialEpi.
18. Team RC: R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2012. http://www.R-project.org/.
19. Keitt TH, Bivand R, Pebesma E, Rowlingson B: Rgdal: Bindings for the Geospatial Data Abstraction Library. 2012. http://CRAN.R-project.org/package=rgdal.
20. AuverGrid. http://www.auvergrid.fr/.
21. Jones SG, Kulldorff M: Influence of spatial resolution on space-time disease cluster detection. PLoS One 2012:7.



doi:10.1186/1476-072X-12-47

Cite this article as: Guttmann et al.: Performance map of a cluster 
detection test using extended power. International Journal of Health 
Geographies 2013 12:47. 


